Purification Means

The present invention relates to purification means,
in particular to means suitable for use in the
purification of soluble proteins.

Introduction

The recombinant production of protein in bacteria,
yeast, insect and mammalian cell lines has become a
cornerstone of biological research and the
biotechnology industry. Classical biochemical and
chromatographic purification techniques usually
yield too little of a target protein to study its
roles or actions. Even when enough of the protein
can be purified, this usually requires cumbersome
amounts of starting material or tissue and many
processing steps before reasonable purity can be
achieved.

Recombinant expression of the target protein
bypasses many of these problems. By introducing
the target protein's gene template to a cell line or
bacterial culture, induced overexpression can result
in significant levels of that protein being
produced. Large amounts of protein make the
purification a lot simpler, and the addition or
fusion of purification domains or tags allows for a
relatively simple one-step purification using
affinity chromatography resins. However, owing to
the varying nature of proteins, the production of
soluble protein has occasionally remained elusive,
with known tags unable to purify many proteins. In
some cases, production of protein can be a problem
due to differences in the machinery of bacterial
cells. There is therefore a need for a more
versatile tag than is currently available on the
market. The provision of such a versatile tag,
enabling, for example, an improved ability to quickly
produce and screen soluble protein in bacteria such
as E. coli, would represent a major step forward in
protein biochemistry.

Summary of the Invention 

The present inventors have developed a novel 
purification tag based on the gene product of a 
sortase gene, in particular the srtA gene of 
Staphylococcus aureus. This tag, known as SNUT
[Solubility eNhancing Unique Tag], has been found to
have exceptional activity, enabling the efficient 
purification of soluble domains of a number of 
proteins hitherto not able to be isolated 
efficiently using conventional purification tags. 

Throughout this specification, reference to a SNUT
tag should be understood to mean a tag derived from
a sortase gene product.

In a first aspect of the invention, there is
provided a purification tag comprising a sortase,
e.g. srtA, gene product.

In preferred embodiments, the sortase gene product
is a gene product of the srtA gene of Staphylococcus
aureus.

Also provided is the use of a sortase, e.g. srtA,
gene product as a purification tag.

Furthermore, according to a third aspect of the
invention, there is provided an expression construct
for the production of recombinant polypeptides,
which construct comprises an expression cassette
consisting of the following elements that are
operably linked: a) a promoter; b) the coding region
of a DNA encoding a sortase, e.g. srtA, gene product
as a purification tag sequence; c) a cloning site for
receiving the coding region for the recombinant
polypeptide to be produced; and d) transcription
termination signals.



According to a fourth aspect of the invention, there
is provided a method for producing a polypeptide,
comprising: a) preparing an expression vector for
the polypeptide to be produced by cloning the coding
sequence for the polypeptide into the cloning site
of an expression construct according to the third
aspect of the invention; b) transforming a suitable
host cell with the expression construct thus
obtained; and c) culturing the host cell under
conditions allowing expression of a fusion
polypeptide consisting of the amino acid sequence of
the purification tag with the amino acid sequence of
the polypeptide to be expressed covalently linked
thereto; and, optionally, d) isolating the fusion
polypeptide from the host cell or the culture medium
by means of binding the fusion polypeptide present
therein through the amino acid sequence of the
purification tag.

The expression construct, herein referred to as 
pSNUT, may be made by modification of any suitable 
vector to include the coding region of a DNA 
encoding a sortase. In preferred embodiments, the 
expression construct is based on the pQE30 plasmid.

A sample of pSNUT was deposited with the National
Collections of Industrial and Marine Bacteria Ltd.
(NCIMB), 23 St Machar Drive, Aberdeen, Scotland AB24
3RY on 23 December 2002 under accession no. NCIMB
41153.

In a fifth aspect, there is provided a fusion 
polypeptide obtained by the method of the fourth 
aspect of the invention. 



In preferred embodiments, the sortase, e.g.
srtA, gene product (SNUT) is encoded by the
nucleotide sequence shown in Figure 4 or a variant
or fragment thereof. Preferably, the srtA gene
product comprises amino acids 26 to 171 of the SrtA
sequence shown in Figure 4 or a variant or fragment
thereof.

Variants and fragments of and for use in the
invention preferably retain the functional
capability of the polypeptide, i.e. the ability to be
used as a purification tag. Such variants and
fragments, which retain the function of the natural
polypeptides, can be prepared according to methods
for altering polypeptide sequence known to one of
ordinary skill in the art, such as are found in
references which compile such methods, e.g.
Molecular Cloning: A Laboratory Manual, J. Sambrook,
et al., eds., Second Edition, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, New York,
1989, or Current Protocols in Molecular Biology, F.
M. Ausubel, et al., eds., John Wiley & Sons, Inc.,
New York.

A variant nucleic acid molecule shares homology
with, or is identical to, all or part of the coding
sequence discussed above. Generally, variants may
encode, or be used to isolate or amplify nucleic
acids which encode, polypeptides which are capable
of being used as a purification tag.



Variants of the present invention can be artificial
nucleic acids (i.e. containing sequences which have
not originated naturally) which can be prepared by
the skilled person in the light of the present
disclosure. Alternatively they may be novel,
naturally occurring, nucleic acids, which may be
isolatable using the sequences of the present
invention. Thus a variant may be a distinctive part
or fragment (however produced) corresponding to a
portion of the sequence provided in Figure 4. The
fragments may encode particular functional parts of
the polypeptide.

The fragments may have utility in probing for, or
amplifying, the sequence provided or closely related
ones.

Sequence variants which occur naturally may include
alleles or other homologues (which may include
polymorphisms or mutations at one or more bases).
Artificial variants (derivatives) may be prepared by
those skilled in the art, for instance by
site-directed or random mutagenesis, or by direct
synthesis. Preferably the variant nucleic acid is
generated either directly or indirectly (e.g. via
one or more amplification or replication steps) from
an original nucleic acid having all or part of the
sequences of Figure 4. Preferably it encodes a
polypeptide which can be used as a purification
tag.



The term 'variant' nucleic acid as used herein
encompasses all of these possibilities. When used in
the context of polypeptides or proteins it indicates
the encoded expression product of the variant
nucleic acid.

Homology (i.e. similarity or identity) may be as
defined using sequence comparisons made using
FASTA and FASTP (see Pearson & Lipman, 1988. Methods
in Enzymology 183: 63-98). Parameters are preferably
set, using the default matrix, as follows:
Gapopen (penalty for the first residue in a gap):
-12 for proteins / -16 for DNA
Gapext (penalty for additional residues in a gap):
-2 for proteins / -4 for DNA
KTUP word length: 2 for proteins / 6 for DNA.
Homology may be at the nucleotide sequence and/or
encoded amino acid sequence level. Preferably, the
nucleic acid and/or amino acid sequence shares at
least about 60%, or 70%, or 80% homology, most
preferably at least about 90%, 95%, 96%, 97%, 98% or
99% homology with the sequence shown in Figure 4.
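
As a rough illustration of the percentage thresholds
above, the following Python sketch computes percent
identity over two sequences that have already been
aligned to equal length. It is not an implementation
of FASTA or FASTP; the function names, the gap
character and the toy sequences in the usage note
are assumptions made purely for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are skipped (an assumption of this sketch)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True if percent identity is at least the given threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For example, percent_identity("ACGT", "ACGA") compares
four columns, of which three match. How gapped
columns are counted varies between tools, so the
figure from a real alignment program may differ.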

Thus a variant polypeptide in accordance with the
present invention may include within the sequence
shown in Figure 4, a single amino acid or 2, 3, 4,
5, 6, 7, 8, or 9 changes, about 10, 15, 20, 30, 40
or 50 changes. In addition to one or more changes
within the amino acid sequence shown, a variant
polypeptide may include additional amino acids at
the C-terminus and/or N-terminus.

Naturally, regarding nucleic acid variants, changes
to the nucleic acid which make no difference to the
encoded polypeptide (i.e. 'degeneratively
equivalent') are included within the scope of the
present invention.

Preferred variants include one or more of the
following changes (using the annotation of AF162687):
nucleotide 604 A→G causing an amino acid mutation of
K→R; nucleotide 647 A→G, codon remains K, therefore
a silent mutation; nucleotide 966 G→A causing an
amino acid mutation of G→Q.
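
The difference between a missense and a silent
change can be sketched in Python using the standard
genetic code. The codon "AAA" and the positions
within the codon are hypothetical, since the text
gives only nucleotide numbers; the table is
deliberately partial and covers only lysine and
arginine codons.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "AAA": "K", "AAG": "K",  # lysine
    "AGA": "R", "AGG": "R",  # arginine
}


def substitute(codon: str, position: int, new_base: str) -> str:
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def classify(codon: str, position: int, new_base: str):
    """Return (mutated codon, amino acid change, 'silent' or 'missense')."""
    mutated = substitute(codon, position, new_base)
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    kind = "silent" if before == after else "missense"
    return mutated, f"{before}->{after}", kind
```

With the hypothetical codon AAA, an A→G change at the
second base gives AGA (K→R), whereas the same change
at the third base gives AAG, which still encodes K.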

Changes to a sequence, to produce a derivative, may
be by one or more of addition, insertion, deletion
or substitution of one or more nucleotides in the
nucleic acid, leading to the addition, insertion,
deletion or substitution of one or more amino acids
in the encoded polypeptide. Changes may be by way of
conservative variation, i.e. substitution of one
hydrophobic residue such as isoleucine, valine,
leucine or methionine for another, or the
substitution of one polar residue for another, such
as arginine for lysine, glutamic for aspartic acid,
or glutamine for asparagine. As is well known to
those skilled in the art, altering the primary
structure of a polypeptide by a conservative
substitution may not significantly alter the
activity of that peptide because the side chain of
the amino acid which is inserted into the sequence
may be able to form similar bonds and contacts to
the side chain of the amino acid which has been
substituted out. This is so even when the
substitution is in a region which is critical in
determining the peptide's conformation.

Also included are variants having non-conservative
substitutions. As is well known to those skilled in
the art, substitutions to regions of a peptide which
are not critical in determining its conformation may
not greatly affect its activity because they do not
greatly alter the peptide's three-dimensional
structure.

In regions which are critical in determining the
peptide's conformation or activity, such changes may
confer advantageous properties on the polypeptide.
Indeed, changes such as those described above may
confer slightly advantageous properties on the
peptide, e.g. altered stability or specificity.

SNUT tags and vectors may be used in methods of
purifying a soluble domain of a peptide.
Accordingly, in a further aspect of the invention,
there is provided a method of producing a soluble
bioactive domain of a protein, the method
comprising the steps of cloning DNA encoding at
least one candidate soluble domain into at least one
expression vector, transfecting or transforming a
host cell with said vector, expressing said DNA in
said host cell, wherein said vector encodes a
sortase gene product.



The sortase gene product is preferably in the form 
of a fusion protein. 

The method may comprise the steps of analysis of DNA 
coding for the protein of interest to identify 
antigenic soluble domains, designing oligonucleotide 
primers to amplify DNA encoding the domain, 
amplifying DNA, cloning the DNA, optionally 
screening clones for correct orientation of DNA, 
expressing DNA in expression strains, analysing 
expression products for solubility, analysing 
products and production of soluble bioactive protein 
domain. 

The method optionally comprises the step of
producing a soluble bioactive protein domain of said
protein of interest.

The invention is exemplified with reference to the
following non-limiting description and the
accompanying figures, in which:

Figure 1 shows selected domains for amplification
from in silico analysis. Representation of a
candidate protein for the expression platform, in
this case Jak1 (human). Four fragments have been
chosen by analysis as depicted.

Figure 2 shows denaturing dot-blot analysis of
expression clones of fragments of MAR1 in pQE30.

Figure 3 shows a ribbon diagram of Staphylococcus
aureus sortase. Ribbon diagram of the putative
structure of S. aureus SrtA protein (minus its
N-terminal membrane anchor). SNUT represents the
portion of this structure between the two yellow
arrows as shown. The yellow ball signifies a Ca2+
ion, essential for the biological activity of this
protein. This diagram is taken from Ilangovan et
al., 2001, PNAS 98(11) 6056
(doi: 10.1073/pnas.101064198).

Figure 4 shows the nucleotide sequence and amino
acid sequence of the SNUT fragment.

(a) This is the determined sequence of SNUT. The
fragment was cloned into pQE30 using the BamHI site
of this vector. When in the wanted orientation,
insertion results in the inactivation of the
upstream cloning site, therefore allowing any
subsequent cloning of target inserts with the
downstream BamHI site (see (b) for restriction map
of sequence).

Figure 5 illustrates qualitative purification
results using the SNUT fusion tag. (a) shows the
elution profile on SDS-PAGE of SNUT-Jak1 using AKTA
Prime native histag purification. Successful
elution of the SNUT-Jak1 construct is signified by
the white arrow. (b) shows the elution profile on
SDS-PAGE of SNUT-MAR1 using AKTA Prime native histag
purification. Successful elution is shown by the
arrow. (c) shows the same gel as stained in (b),
western blotted and detected using
poly-histidine-HRP antibody. This confirms that the
eluted species in (b) is SNUT-MAR1, of the expected
molecular weight.

Template analysis and primer design 

Analysis of the DNA coding for a protein of interest
may be performed using software packages such as
Vector NTI (Informax, USA),
BLASTP (http://www.ncbi.nlm.nih.gov/BLAST/), Pfam
(www.sanger.ac.uk/pfam) and TMpred
(www.hgmp.mrc.ac.uk), which may be used to identify
complete domains within the protein that
significantly increase the likelihood of
antigenicity and/or solubility when expressed as a
subunit of the original protein coding sequence.

In order to increase the possibility of identifying 
a soluble domain, preferably multiple sub-domains, 
more preferably at least three sub-domains, for 
example 3 to 9 sub-domains may be identified for 
processing. 

Oligonucleotide primers to amplify the selected
sub-domains may be designed with the help of
commercially available software packages such as the
internet software package Primer3
(http://www-genome.wi.mit.edu/genome_software/other/primer3.html;
Whitehead Institute for Biomedical Research),
Vector NTI (www.informaxinc.com) and DNASIS (Hitachi
Software Engineering Company, www.oligo.net).

Typically primers for use in a method of the
invention are in the range 10-50 base pairs in
length, preferably 15 to 30, for example 20 base
pairs in length, with annealing temperatures in the
range 45-72 °C, more conveniently 55-60 °C. Primers
may be synthesised using standard techniques or may
be sourced from commercial suppliers such as
Invitrogen Life Technologies (Scotland) or
MWG-Biotech AG (Germany).
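
The length and temperature windows above can be
applied as a simple filter. The following Python
sketch is illustrative only: it estimates a melting
temperature with the Wallace rule, 2 °C per A/T and
4 °C per G/C, which is a rough approximation for
short oligonucleotides, and treats that estimate as
a stand-in for the annealing temperature. The
function names and default windows are assumptions,
not part of the described method.

```python
def wallace_tm(primer: str) -> int:
    """Estimate melting temperature (°C) by the Wallace rule: 2*(A+T) + 4*(G+C)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    if at + gc != len(p):
        raise ValueError("primer must contain only A, C, G and T")
    return 2 * at + 4 * gc


def primer_in_range(primer: str, min_len: int = 15, max_len: int = 30,
                    min_temp: int = 55, max_temp: int = 60) -> bool:
    """True if primer length and estimated temperature fall in the given windows."""
    return (min_len <= len(primer) <= max_len
            and min_temp <= wallace_tm(primer) <= max_temp)
```

In practice a primer design package such as those
named above would use a more refined thermodynamic
model; this sketch only shows how the stated ranges
might be checked.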

PCR of Insert

The desired inserts which encode the selected
sub-domains are amplified using the primers designed
specifically for that target gene using standard PCR
techniques. The template DNA for amplification can
be in the form of plasmid DNA, cDNA or genomic DNA,
depending on whatever is appropriate or indeed
available. Any suitable DNA polymerase may be used,
for example, Platinum Taq, Pfu (www.stratagene.com)
or Pfx (www.invitrogen.com). Any suitable PCR
system may be used, for example, the Expand High
Fidelity PCR system (Roche, Basel, Switzerland).

Several different thermocycler conditions may be
used with each set of primers. This increases the
chance of the PCR working without having to
individually optimise each new primer set. Typically
the following three programs may be used in the
method:

1. A standard PCR programme using the recommended
   annealing temperature provided with the
   primers.

2. A standard PCR programme using 50°C as the
   temperature for annealing.

3. A touchdown PCR programme, where the annealing
   temperature starts at a high temperature, e.g.
   65°C, for 10 cycles and then gradually decreases
   to 50°C over the subsequent, e.g., 15 cycles.
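
The touchdown programme in item 3 can be sketched as
a list of per-cycle annealing temperatures. The
Python below assumes a linear decrease in equal
steps per cycle, which the text does not specify;
the function name and parameter names are likewise
illustrative.

```python
def touchdown_schedule(start_temp: float = 65.0, end_temp: float = 50.0,
                       hold_cycles: int = 10, ramp_cycles: int = 15) -> list:
    """Annealing temperature (°C) for each cycle of a touchdown programme.

    The start temperature is held for hold_cycles, then lowered in equal
    steps so that end_temp is reached on the last of ramp_cycles.
    """
    if hold_cycles < 0 or ramp_cycles < 1:
        raise ValueError("hold_cycles must be >= 0 and ramp_cycles >= 1")
    step = (start_temp - end_temp) / ramp_cycles
    schedule = [start_temp] * hold_cycles
    schedule += [round(start_temp - step * i, 2) for i in range(1, ramp_cycles + 1)]
    return schedule
```

With the example values from the text, this gives 25
cycles in total and a drop of 1 °C per cycle during
the ramp.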



Buffer conditions may be adjusted as required, for
example with respect to magnesium ion concentration
or addition of DMSO for the amplification of
difficult templates. Further details of a suitable
purification method which may be used with the
vector or tag of the invention can be found in our
co-pending PCT application, filed on the same day as
this application and claiming priority from GB
0131026.7.

The PCR products may be visualised using standard
techniques, for example on a 1.5% agarose gel
stained with ethidium bromide, and the bands cut
out of the gel and purified using the MinElute Gel
Extraction Kit (Qiagen, Crawley, England).

Expression Vectors

Amplified DNA inserts may be cloned into expression
vectors using techniques dictated by the multiple
cloning sites of the vector in question. Such
techniques are readily available to the skilled
person.

Any suitable expression system can be used in the
invention. Preferably, the expression system is
prokaryotic. Suitable vectors for use in the method
of the invention include any vector which can encode
SNUT [Solubility eNhancing Unique Tag], for example
pSNUT. This tag is based on the sequence of a
trans-peptidase found on the surface of
Gram-positive bacteria. This protein is highly
soluble, and expressed at very high levels.

The inventors have found that SNUT is an ideal
fusion tag for conferring solubility and high
expression levels on target protein fragments. SNUT
may be cloned into any suitable vector. For the
purposes of the examples shown in this application,
the sequence incorporating the SNUT fragment is
cloned into pQE30 (Qiagen, Valencia, CA) in a manner
allowing full use of the multiple cloning site (MCS)
of this vector for downstream gene insertions.

Development of pSNUT 

The inventors found that a tag based on the srtA
gene product from Staphylococcus aureus is highly
soluble, responds well to purification schemes and
expresses particularly well. It was hypothesised
that the incorporation of a portion or domain of
this protein could represent a useful fusion tag in
the present method, and indeed in the expression of
any poorly soluble protein in E. coli. Using NMR
studies, the 3D structure of this protein has been
predicted and is shown in Figure 3. We hypothesised
that by taking a portion of this structure, we could
make a manipulable protein tag without disturbing
its tertiary structure enough to reduce the highly
favourable characteristics listed above. The region
of this protein used as a solubility-enhancing tag
is depicted by two arrows.

The SNUT tag was cloned into pQE30. However, it may
be cloned into any suitable expression vector.
Positive clones may be identified by denaturing dot
blots, SDS-PAGE and Western blotting. Final
confirmation of these clones was provided by DNA
sequencing, and the sequence of the multiple cloning
region of the resultant vector is shown in Figure 4.

Variances in the sequence of the SNUT domain were
observed from the sequence for SrtA that has been
logged in GenBank (AF162687). The variances are
(using the annotation of AF162687): nucleotide 604
A→G causing an amino acid mutation of K→R;
nucleotide 647 A→G, codon remains K, therefore a
silent mutation; nucleotide 966 G→A causing an amino
acid mutation of G→Q.

Preliminary trials and native purification showed
that the SNUT fragment was very soluble and its
characteristics were in no way diminished by
truncation, thus showing that SNUT could represent a
useful tag domain (data not shown). As described in
the Examples, to fully test the abilities of SNUT,
we then chose two proteins where soluble protein
production had proved impossible using conventional
methods and using the other expression systems of
the method of the present invention. Surprisingly,
we found that, using pSNUT in the method of the
invention, these proteins could be produced in
soluble form.

Clone Propagation

Target insert/expression vector ligations may be
propagated using standard transformation techniques
including the use of chemically competent cells or
electro-competent cells. The choice of the host
cell and strain for transformation is dependent on
the characteristics of the expression vectors being
utilised.

Bacterial cells, for example, Escherichia coli, are
the preferred host cells. However, any suitable
host cell may be used. In preferred embodiments, the
host cells are Escherichia coli.

The vectors may each be used to transfect or
transform a plurality of different host cell
strains. The set of host cell strains for an
individual vector may be the same as or different
from the set used with other vectors.

In a particularly preferred embodiment of the
invention, each vector may be transformed into three
E. coli strains (for example, selected from
Rosetta(DE3)pLacI, Tuner(DE3)pLacI, Origami BL21
(DE3)pLacI and TOP10F', Qiagen).

Where the vectors are pQE-based vectors, TOP10F'
cells are preferred for the propagation and
expression trials of such vectors. The present
inventors have identified this strain as superior
for these vectors to either of the strains
recommended by the supplier (M15 and SG13009), in
terms of ease of use and culture maintenance (only
one antibiotic is required, as opposed to two with
M15 or SG13009; www.qiagen.com). Other F' strains
such as XL1 Blue can be used, but are inferior to
the TOP10F' strain, due to lack of expression
regulation (results not shown). The use of TOP10F'
(Invitrogen) for the propagation and/or expression
of pQE-based vectors forms an independent aspect of
the present invention.

After transformation, cells may be plated out onto
selection plates and propagated for the development
of single colonies using standard conditions.

Propagation of Cells

The colonies may be used to inoculate duplicate
wells in a 96-well plate.

Typically, each well may contain 200 µl of LB broth
with the appropriate antibiotics. Each plate may be
dedicated to one strain of E. coli or other host
cell, which alleviates the problems of different
growth rates. The necessary controls are also
included on each plate. The plates are then grown
up, preferably at 37°C or any other temperature as
appropriate to the particular host cell and vector,
with shaking, until log phase is reached. This is
the primary plate.
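
One way to lay out duplicate wells on a 96-well
plate is sketched below in Python. The row-major
ordering, the pairing of adjacent wells and the
choice of H11 and H12 as control wells are all
assumptions made for illustration; the text states
only that colonies are inoculated in duplicate and
that controls are included on each plate.

```python
from string import ascii_uppercase

ROWS = ascii_uppercase[:8]   # rows A-H of a 96-well plate
COLUMNS = range(1, 13)       # columns 1-12


def duplicate_layout(colony_ids, control_wells=("H11", "H12")):
    """Map each colony to a pair of wells, skipping wells reserved for controls."""
    wells = [f"{r}{c}" for r in ROWS for c in COLUMNS
             if f"{r}{c}" not in control_wells]
    if 2 * len(colony_ids) > len(wells):
        raise ValueError("too many colonies for one plate")
    layout = {}
    for i, colony in enumerate(colony_ids):
        # Consecutive free wells in row-major order hold the two replicates.
        layout[colony] = (wells[2 * i], wells[2 * i + 1])
    return layout
```

With two control wells reserved, 94 wells remain, so
up to 47 colonies fit on one plate in duplicate.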

From the primary plate a secondary plate is seeded
and then grown. Typically, the secondary plate is
seeded using 'hedgehog' replicators and then
grown up to, for example, log phase, and chilled to
16°C for 1 hour. Determination of positive clones
from these plates may be undertaken using functional
studies. Routinely, 6-48 clones for each
insert-vector ligation are taken and propagated in
culture micro-titre plates containing up to 500 µl
of media. According to the conditions and reagents
required, protein production is then induced, and
cultures propagated further. Most vectors are under
the control of a promoter such as T7, T7lac or T5,
and can be easily induced with IPTG during log phase
growth. Typically, cultures are propagated in a
peptone-based medium such as LB or 2YT supplemented
with the relevant antibiotic selection marker.
These cultures are grown at temperatures ranging
from 4-40 °C, but more frequently in the range of
20-37 °C depending on the nature of the expressed
protein, with or without shaking, and induced when
appropriate with the inducing agent (usually in log
or early stationary phase). After induction,
propagation can be continued for 1-16 hours for a
detectable amount of protein to be produced.

The primary plate is preferably stored at 4°C until
the process is complete.

Colony Screening for Inserts in Correct Orientation

The method of the invention may include the step of
testing transformants for correct orientation of the
inserts. Identification of positive clones can be
achieved through a variety of methods, including
standard techniques such as digestion analysis of
plasmid DNA, colony PCR and DNA sequencing.
Alternatively, dot-blotting may be used for the
identification of positive clones, for example using
a BioDot apparatus (BioRad) containing
nitrocellulose membrane (0.45 µm pore size) in
accordance with the manufacturer's instructions,
prior to final confirmation by DNA sequencing.

The use of this dot blotting method in the platform
represents a rapid, reproducible and robust
detection method. This particular method is useful
for the rapid detection of the presence of
recombinant protein and allows for a determination
of all clones irrespective of solubility and
conformation. This may be important at this stage,
because conformational structures can inhibit the
detection of tag domains if they are not presented
properly on the surface of the protein. This can
occur as easily with soluble as with insoluble
protein.

As described above, standard colony PCR techniques
may be used. For example, transformants may be
selected, either manually or using automation such
as the Cambridge BioRobotics BioPick instrument, and
screened using directional PCR, using a primer that
encodes a sequence on the vector, such as the S Tag
or GATA sequence, and a complementary primer
from the insert. A PCR mix such as the RedTaq DNA
Polymerase (Sigma Aldrich, Dorset, England) may be
used, and the thermocycler conditions used may be
the standard PCR programme using 50°C as the
annealing temperature or adjusted as required.

Although all colony selecting and picking can be 
done manually, automated colony pickers are 
preferred. Automated colony pickers such as the 
BioRobotics BioPick allow for the uniform and 
reproducible selection of clones from transformation 
plates. Clone selection determinants can be set to 
ensure picking colonies of a standardised size and 
shape. After picking and plate inoculation, 
propagation of clones can be carried out as 
described above. 

Identification of positive clones can be achieved
through a variety of methods, including standard
techniques such as digestion analysis of plasmid
DNA, colony PCR and DNA sequencing. Alternatively, in
a preferred embodiment, the novel method of
dot-blotting described herein for the identification
of positive clones may be used in place of such
traditional techniques, prior to final confirmation
by DNA sequencing. The use of this method in the
platform presented here is not essential to the use
of this platform over existing screening
methodologies, but represents a rapid, reproducible
and robust detection method. The protocol described
here is a new protocol for an existing method for
which commercially available equipment (Bio-Rad
DotBlot) can be purchased.

This particular method is useful for the rapid
detection of the presence of recombinant protein and
allows for a determination of all clones
irrespective of solubility and conformation. This
is useful at this stage, because conformational
structures can inhibit the detection of tag domains
if they are not presented properly on the surface of
the protein. This can occur as easily with soluble
as with insoluble protein.

For example, after growth on the micro-titre plates
is complete, the plate is centrifuged at 4000 rpm
for 10 minutes at 4 °C to harvest the bacterial
cells. The supernatant is removed and the cell
pellets are re-suspended in 50 µl lysis buffer (10
mM Tris.HCl, pH 9.0, 1 mM EDTA, 6 mM MgCl2)
containing benzonase (1 µl/ml). The plate is
subsequently incubated at 4 °C with shaking for 30
minutes. A sample (10 µl) of the cell lysate is
added to 100 µl buffer (8 M urea, 500 mM NaCl, 20 mM
sodium phosphate, pH 8.0) and incubated at room
temperature for 20 minutes. Samples are then
applied to a BioDot apparatus (BioRad) containing
nitrocellulose membrane (0.45 µm pore size) in
accordance with the manufacturer's instructions.
The membrane is removed and transferred into
blocking reagent (3% w/v bovine serum albumin in
TBS) for 30 minutes at room temperature. The blot
is washed briefly with TBS, then incubated with a
primary antibody specific to the tag being used for
the subset of expression clones. Whether a
secondary antibody is required depends on the
nature of the primary antibody, i.e. whether or not
it carries a horseradish peroxidase (HRP) reporter
function. For detection of specific binding, the
membrane is then washed twice for 5 minutes in TBS,
followed by one 5 minute wash in 10 mM Tris.HCl pH
7.6. Specifically bound antibody is revealed by the
addition of chromogenic substrate (6 mg
diaminobenzidine in 10 ml 10 mM Tris.HCl pH 7.6
containing 50 µl 6% H2O2). The reaction is
stopped by thorough rinsing in water. Positive
clones identified by this procedure can then be
confirmed by DNA sequencing of the expression
construct using now industry-standard techniques and
equipment such as that supplied by ABI and Amersham
Biosciences.
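
The working concentrations implied by the volumes in
the dot-blot protocol above can be checked with a
short calculation. This is only an illustrative
sketch: it assumes that volumes are additive on
mixing and that the 6% H2O2 figure can be treated as
a simple percentage. The function name is
hypothetical.

```python
def final_concentration(stock, stock_volume, total_volume):
    """Concentration after diluting stock_volume of stock to total_volume."""
    return stock * stock_volume / total_volume

# Denaturation step: 10 µl lysate added to 100 µl of 8 M urea buffer.
lysate_ul, buffer_ul = 10.0, 100.0
total_ul = lysate_ul + buffer_ul  # assumes additive volumes
urea_final_M = final_concentration(8.0, buffer_ul, total_ul)
lysate_dilution = total_ul / lysate_ul

# Chromogenic substrate: 6 mg DAB in 10 ml buffer plus 50 µl of 6% H2O2.
dab_mg_per_ml = 6.0 / 10.0
h2o2_final_pct = final_concentration(6.0, 0.050, 10.0 + 0.050)

print(f"urea: {urea_final_M:.2f} M, lysate diluted {lysate_dilution:.0f}-fold")
print(f"DAB: {dab_mg_per_ml:.1f} mg/ml, H2O2: {h2o2_final_pct:.3f} %")
```

Under these assumptions the lysate is diluted 11-fold
and the urea concentration stays above 7 M during the
denaturing incubation.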

Sequencing



The sequencing reactions may be performed using
techniques common in the art and any suitable
apparatus. For example, sequencing may be performed
on the cloned inserts using the Big Dye Terminator
cycle sequencing kits (Applied Biosystems,
Warrington, UK) and the specific sequencing primer,
run on a Peltier Thermal Cycler model PTC225 (MJ
Research, Cambridge, Mass). The reactions may be run
on an Applied Biosystems - Hitachi 3310 Sequencer
according to the manufacturer's instructions. These
sequences are checked to ensure that no
PCR-generated errors have occurred.

Assessment of Solubility of Positive Clones

The cells of positive clones may be harvested and
soluble and insoluble protein detected.

Any suitable technique known in the art can be used
to separate soluble and insoluble protein, such as
centrifugation, magnetic bead technologies and
vacuum manifold filtration. Typically, however, the
separated proteins are ultimately analysed by
acrylamide gel electrophoresis and western blotting.
This confirms the presence of recombinant protein at
the correct size.

In one embodiment, the contents of each well in the
96 well plate are transferred into a Millipore 0.65
µm multi-screen plate. The plate is placed on a
vacuum manifold and a vacuum is applied. This draws
off the culture medium to waste. The cells are then
optionally washed with PBS, and the vacuum is
applied again to remove the PBS. The multi-screen
plate is removed from the manifold and bacterial
cell lysis buffer (containing DNAse) (50 µl) is
added to each well. The plate is incubated at room
temperature for 30 minutes with shaking to
facilitate lysis of the cells. A fresh 96 well
microtitre plate (ELISA grade) is placed inside the
vacuum manifold and the multi-screen plate is placed
above it. When a vacuum is applied, the contents of
each well are drawn into the micro-titre plate
below. The vacuum only needs to be applied for 20
seconds. The collected lysate contains the soluble
fraction of expressed protein. A sample of the
collected lysate may subsequently be analysed by
SDS-PAGE and Western blotting to confirm both the
presence and the correct molecular weight of the
target protein.

The use of SDS-PAGE and Western blotting can be
expensive and time consuming, especially when
numerous samples must be analysed for each
construct. In light of this we have developed a
protocol whereby one gel can be used for both total
protein staining and western blotting. This
represents a significant improvement in this
methodology: it saves cost, and it allows precise
comparisons between total protein and western
blotting because both sets of results come from the
same gel.

The basis of this protocol is the ability to use
chloroform and UV light to stain protein on an
SDS-PAGE gel (Kazmin et al., Anal Biochem, 2001,
301(1): 91-6; doi:10.1006/abio.2001.5488). We have
used this technique to great effect, as it allows
extremely rapid staining of an SDS-PAGE gel in less
than a tenth of the time taken using more
traditional staining methods such as Coomassie
Brilliant Blue and Colloidal Blue stains. We then
decided to take this observation a step further and
analyse the ability of a chloroform-stained gel to
be used in Western blotting. This would not be
expected to work, as other staining methods fix the
protein to the gel so that it cannot subsequently be
transferred during blotting. This expectation is
reinforced by the fact that chloroform is not
compatible with western blotting equipment (Bio-Rad
SD blotter user's manual). However, fortuitously, we
have discovered that after a wash of the
chloroform-stained gel in double-distilled water, to
remove excess chloroform, and subsequent soaking in
transfer buffer, proteins were effectively
transferred during western blotting, contrary to
expectations. This transfer was no less effective
than from a gel that had not been pre-stained with
chloroform and UV light. Figure 6 primarily shows
results relating to the production of soluble
protein by the platform, but also shows the ability
to use the chloroform-stained SDS-PAGE derived
western blot for the identification of proteins,
without any apparent damage caused to the proteins.

The use of a chloroform-stained SDS-PAGE derived
western blot for the identification of proteins
forms another aspect of the present invention.

Scale-Up and Purification

This analysis provides a picture of the expression
status of the clones on each plate. Using this
analysis, clones expressing soluble protein can be
identified for the production of soluble
recombinant protein for a given target protein. The
clones may be selected and their growth scaled up,
e.g. to 5 ml scale, using the saved primary plate as
an inoculum. Parameters that may be taken into
consideration in deciding on the appropriate culture
to select for scale-up include the desirability of
specific regions for the production of an antigen,
the overall expression levels of the clone and
factors that may affect affinity purification, such
as amino acid composition.

Example 1. Expression construct design

Figure 1 is a diagrammatic representation of the
protein Jak1. Using Pfam, the position of distinct
domains was established. Further analysis of these
domains was then carried out using TMpred and the
Kyte and Doolittle hydrophobicity algorithm to
determine the usefulness of these domains as soluble
antigens. From this tentative analysis, four
domains were selected for amplification and
expression analysis. Based on this preliminary in
silico analysis, primers specific for a target
protein were designed and used to amplify the
domains selected for analysis.
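
The kind of hydrophobicity analysis referred to above
can be sketched as a sliding-window average over the
Kyte and Doolittle hydropathy scale. The sketch below
is illustrative only: the window length of 9 and the
example peptide are assumptions chosen for the
illustration, not values taken from this
specification, and the function name is hypothetical.

```python
# Kyte and Doolittle hydropathy values per amino acid (one-letter code).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every window of the given length along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]

# Hypothetical peptide, used only to show the call.
profile = hydropathy_profile("MKKLLIVAVLLGSDDEKRQ", window=9)
for start, score in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {score:+.2f}")
```

Strongly positive windows indicate hydrophobic
stretches, which would argue against a region as a
soluble antigen; strongly negative windows indicate
hydrophilic stretches.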

Vectors (500 ng) were restricted with BamHI (20
units) and SalI (20 units) in the presence of calf
intestinal alkaline phosphatase (CIP) (2 units), gel
purified and quantified using standard methods.
Purified PCR fragments (100 ng) were restricted with
BamHI (5 units) and SalI (5 units), gel purified,
quantified, and then used in a ligation reaction
with the restricted vector, again using standard T4
DNA ligase methods (Ready-to-Go T4 DNA ligase,
Amersham Biosciences). A sample of the ligation
reaction (1 µl) was then used to transform the
appropriate competent bacterial cells (TOP10F' were
used here for the pQE based vectors, a modification
of the manufacturer's recommendations;
BL21(DE3)pLysE for pET43.1a and TOP10F' for
pGEX-Fus). Transformants were selected on
LB/ampicillin (100 µg/ml) overnight at 28 °C.

A Cambridge BioRobotics BioPick instrument was used
to pick 24 colonies from each of the transformant
plates into flat-bottomed, lidded micro-titre
plates. The clones were used to inoculate 150 µl of
LB (containing 100 µg/ml ampicillin), and these
cultures were allowed to grow overnight at 37 °C.
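
One way to keep track of picked colonies is a simple
plate map. The sketch below is a hypothetical
illustration: it assumes a 96-well plate filled in
row-major order (A1 to H12) with 24 colonies from
each of four transformant plates, a layout that is
not stated in this specification. The construct
labels and function names are placeholders.

```python
from string import ascii_uppercase

def well_names(rows=8, cols=12):
    """Well IDs of a micro-titre plate in row-major order: A1..A12, B1.."""
    return [
        f"{ascii_uppercase[r]}{c}"
        for r in range(rows)
        for c in range(1, cols + 1)
    ]

def plate_map(constructs, colonies_per_construct=24):
    """Assign each picked colony to a well, one construct after another."""
    wells = well_names()
    if len(constructs) * colonies_per_construct > len(wells):
        raise ValueError("more colonies than wells on the plate")
    mapping = {}
    for i, construct in enumerate(constructs):
        for j in range(colonies_per_construct):
            well = wells[i * colonies_per_construct + j]
            mapping[well] = (construct, j + 1)  # colony numbers start at 1
    return mapping

# Placeholder labels for four transformant plates.
layout = plate_map(["domain_1", "domain_2", "domain_3", "domain_4"])
print(layout["A1"], layout["C1"], layout["H12"])
```

With this assumed layout, each construct occupies two
full rows of the plate, so the same map can be reused
for the secondary plate and for the dot-blot results.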

A secondary plate was prepared by the inoculation of
200 µl of LB containing the required supplements
with 10 µl of the overnight primary culture. These
cultures were then grown at 37 °C. Once an optical
density (OD) of 0.25 at A550 was reached, IPTG
(final concentration, 1 mM) was added to induce
expression of the recombinant protein. Culture
propagation was continued for another 4 hours prior
to harvesting of bacterial cells.
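
The volume of IPTG stock needed to reach a final
concentration of 1 mM in each well follows from a
simple dilution balance. The sketch below is
illustrative: the 100 mM stock concentration is an
assumption (no stock concentration is given here),
the well volume is taken as 200 µl of medium plus
10 µl of inoculum, and the function name is
hypothetical.

```python
def stock_volume_ul(final_mM, culture_ul, stock_mM):
    """Stock volume to add so the mixture reaches final_mM.

    Solves stock_mM * v = final_mM * (culture_ul + v) for v, so the
    volume added with the stock itself is taken into account.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * culture_ul / (stock_mM - final_mM)

ASSUMED_STOCK_mM = 100.0      # assumption: not stated in the text
culture_ul = 200.0 + 10.0     # medium plus overnight inoculum

iptg_ul = stock_volume_ul(1.0, culture_ul, ASSUMED_STOCK_mM)
print(f"add {iptg_ul:.2f} µl of {ASSUMED_STOCK_mM:.0f} mM IPTG per well")
```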

After clones expressing specific recombinant protein
have been identified, the solubility of these
proteins has to be established prior to clone
selection for purification. This can be performed in
a number of ways, including the use of
centrifugation and automation-friendly vacuum
manifold separations. The results here were obtained
using methodologies based around the use of
vacuum-assisted filtration to separate soluble and
insoluble protein. The filtrates produced by the
method described were then analysed by SDS-PAGE and
Western blotting to confirm the production of a
recombinant protein of the correct anticipated
molecular weight.

Example 2 Design and Construction of SNUT Expression
Tag

Based on analysis of the amino acid sequence and
predicted structure of SrtAΔN, it was decided to
amplify the region of amino acids 26 to 171 of the
SrtA sequence. Amplification was conducted using
the forward primer 5' TTTTTTAGATCTAAACCACATATCGAT
and the reverse primer 5'
TTTTTTGGATCCATCTAGAACTTCTAC. This product was then
digested with BglII and BamHI and ligated into pQE30
vector which had also been digested with BamHI to
form the pSNUT vector. The ligation mix was
transformed into TOP10F' cells and single colonies
propagated on LB agar containing 100 µg/ml
ampicillin. Clones with the srtA fragment in the
correct orientation were screened by expression
analysis and positive clones identified using the
denaturing dot-blot assay described earlier.
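
The restriction sites carried by the two primers, and
the size of the amplified coding region, can be
checked directly from the sequences given above. The
sketch below uses the standard recognition sequences
AGATCT (BglII) and GGATCC (BamHI); the function name
is hypothetical, and the length calculation covers
only the codons for amino acids 26 to 171, not the
primer tails.

```python
# Recognition sequences; both are palindromic, so one strand suffices.
SITES = {"BglII": "AGATCT", "BamHI": "GGATCC"}

forward = "TTTTTTAGATCTAAACCACATATCGAT"
reverse = "TTTTTTGGATCCATCTAGAACTTCTAC"

def find_sites(seq):
    """Map each enzyme whose site occurs in seq to its 0-based position."""
    return {
        name: seq.find(site)
        for name, site in SITES.items()
        if site in seq
    }

# Amino acids 26 to 171 inclusive.
first_aa, last_aa = 26, 171
n_residues = last_aa - first_aa + 1
coding_nt = 3 * n_residues

print("forward primer:", find_sites(forward))
print("reverse primer:", find_sites(reverse))
print(f"{n_residues} residues, {coding_nt} coding nucleotides")
```

BglII and BamHI leave compatible GATC overhangs, so a
BglII/BamHI-cut insert can ligate into a BamHI-cut
vector in either orientation, which is why the clones
are screened for the correct orientation.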

The sequence encoding the SNUT tag was cloned into
pQE30 as described earlier and positive clones
identified by denaturing dot blots, SDS-PAGE and
Western blotting. Final confirmation of these
clones was provided by DNA sequencing, and the
sequence of the multiple cloning region of the
resultant vector is shown in Figure 4. Variances in
the sequence of the SNUT domain were observed from
the sequence for SrtA that has been logged in
Genbank (AF162687). The variances are (using the
annotation of AF162687): nucleotide 604 A→G, causing
an amino acid mutation of K→R; nucleotide 647 A→G,
codon remains K, therefore a silent mutation; and
nucleotide 966 G→A, causing an amino acid mutation
of G→Q.
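
Reading the first two variances as single A to G
substitutions within lysine codons (an interpretation
of the text above, not information taken from the
Genbank record), the possible outcomes can be
enumerated from the standard genetic code. The sketch
below is illustrative only; the partial codon table
lists just the codons needed here and the function
name is hypothetical.

```python
# Subset of the standard genetic code needed for lysine codons
# and their single A-to-G substitution products.
CODONS = {
    "AAA": "K", "AAG": "K",
    "AGA": "R", "AGG": "R",
    "GAA": "E", "GAG": "E",
}

def a_to_g_outcomes(codon):
    """Amino acid encoded after replacing each A in codon with G, in turn."""
    outcomes = {}
    for i, base in enumerate(codon):
        if base == "A":
            mutant = codon[:i] + "G" + codon[i + 1:]
            outcomes[mutant] = CODONS[mutant]
    return outcomes

for lysine_codon in ("AAA", "AAG"):
    print(lysine_codon, "->", a_to_g_outcomes(lysine_codon))
```

An A to G change at the second codon position gives
arginine, while AAA to AAG at the third position
still encodes lysine, which is consistent with one
K→R substitution and one silent change.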

Example 3 Trials of SNUT Expression Constructs 

Target inserts were cloned into the pSNUT vector
using primer construction and digestion of the
resulting PCR amplifications with BamHI and SalI as
described earlier. pSNUT was digested with BamHI in
a similar manner and the target inserts cloned as
described. Clones were screened using the denaturing
dot-blot system and then analysed with SDS-PAGE and
western blotting. Positive clones were used for
preparative 200 ml LB cultures containing 100 µg/ml
ampicillin and induced as described earlier. Each
culture was grown at 37 °C to an optical density of
0.5 at A550. Expression of SNUT was then induced
with the addition of IPTG (final concentration, 1
mM) and the culture left to grow for another 4
hours. Cells were then harvested by centrifugation
at 5000 rpm for 15 minutes. Cells were re-suspended
in 30 ml PBS containing 0.1% Igepal and lysis was
induced by two freeze-thaw cycles. The suspension
was then sonicated and centrifuged at 5000 rpm for
15 minutes. The soluble supernatant was transferred
to a fresh container and filtered through a 0.8 µm
disc filter to remove final cell debris. This
solution was then applied to a Ni2+ charged IMAC
column (Amersham Biosciences HiTrap Chelating
column, 1 ml) using an AKTA Prime low pressure
chromatography system. The column was then treated
using a standard native his-tag purification
protocol, involving washing of the column with 20 mM
sodium dihydrogen phosphate pH 8.0 containing 10 mM
imidazole and 500 mM NaCl, and elution of soluble
his-tagged proteins using 20 mM sodium dihydrogen
phosphate pH 8.0 containing 500 mM imidazole and
500 mM NaCl. Elution fractions were then analysed on
an SDS-PAGE gel (4-20% SDS-PAGE Bio-Rad Criterion
gel), which was stained with chloroform as described
earlier. This gel was subsequently western blotted
and the his-tagged protein detected with
anti-poly-histidine monoclonal antibody using the
techniques described herein.
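
Centrifugation speeds are given above in rpm, which
depends on the rotor used. A speed can be converted
to relative centrifugal force with the usual
relation RCF = 1.118e-5 x r x rpm^2 (r in cm). The
sketch below is illustrative: the 10 cm rotor radius
is an assumption, since no rotor is specified here,
and the function name is hypothetical.

```python
def rcf(rpm, radius_cm):
    """Relative centrifugal force (x g) from rotor speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2

ASSUMED_RADIUS_CM = 10.0  # assumption: the rotor is not stated in the text

for rpm in (4000, 5000):
    print(f"{rpm} rpm at r = {ASSUMED_RADIUS_CM:.0f} cm: "
          f"{rcf(rpm, ASSUMED_RADIUS_CM):.0f} x g")
```

Reporting the force in x g alongside the rpm figure
makes the harvesting and clarification steps
reproducible on a different rotor.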

Preliminary trials and native purification showed
that the SNUT fragment was very soluble and that its
characteristics were in no way diminished by
truncation, thus showing that SNUT could represent a
useful tag domain (data not shown). To fully test
the abilities of SNUT, we then chose two proteins
for which soluble protein production had proved
impossible using other expression systems in which
SNUT was not used as a tag. These were murine MAR1
and human Jak1. Clones were prepared and selected
using the method described in the Examples above,
and positive clones were subsequently grown and
induced at 37 °C. These were then subjected to
identical native his-tag purifications. Both
proteins behaved very favourably under standard
purification conditions, as can be seen from the
purification profiles in Figure 5. For both of these
trial proteins, this was the first example of such
purification under soluble conditions. The
production of these proteins using conventional
techniques had failed to produce any soluble
protein, irrespective of the expression system or
growth conditions used (data not shown). However, as
described in this example, when the protein
fragments were expressed in pSNUT, soluble proteins
were, surprisingly, obtained.

The effectiveness of SNUT as a fusion protein is
even more significant when it is considered that no
special growth conditions were required for the
generation of soluble protein. This is remarkable
when one considers the protein expressionist's
standard GST tag, which is not even soluble itself
when expressed at 37 °C; 28 °C is required before
the generation of even GST on its own, without any
target protein, is observed.

All documents referred to in this specification are
herein incorporated by reference. Various
modifications and variations to the described
embodiments of the invention will be apparent to
those skilled in the art without departing from the
scope and spirit of the invention. Although the
invention has been described in connection with
specific preferred embodiments, it should be
understood that the invention as claimed should not
be unduly limited to such specific embodiments.
Indeed, various modifications of the described modes
of carrying out the invention which are obvious to
those skilled in the art are intended to be covered
by the present invention.


1. Use of a sortase gene product as a purification
tag.

2. The use according to claim 1 wherein the
sortase gene product is a Staphylococcus aureus
srtA gene product.

3. The use according to claim 1 or claim 2 wherein
the sortase gene product is encoded by the
nucleotide sequence shown in Figure 4 or a
variant or fragment thereof.

4. The use according to any one of claims 1 to 3
wherein the sortase gene product comprises
amino acids 26 to 171 of the SrtA sequence
shown in Figure 4 or a variant or fragment
thereof.

5. An expression construct for the production of
recombinant polypeptides, which construct
comprises an expression cassette consisting of
the following elements that are operably
linked: a) a promoter; b) the coding region of
a DNA encoding a sortase gene product as a
purification tag sequence; and c) a cloning
site for receiving the coding region for the
recombinant polypeptide to be produced; and d)
transcription termination signals.

6. The expression construct according to claim 5
wherein the sortase gene product is a
Staphylococcus aureus srtA gene product.

7. The expression construct according to claim 5
or claim 6 wherein the sortase gene product is
encoded by the nucleotide sequence shown in
Figure 4 or a variant or fragment thereof.

8. The expression construct according to any one
of claims 5 to 7 wherein the sortase gene
product comprises amino acids 26 to 171 of the
SrtA sequence shown in Figure 4 or a variant or
fragment thereof.

9. A method for producing a polypeptide,
comprising: a) preparing an expression vector
for the polypeptide to be produced by cloning
the coding sequence for the polypeptide into
the cloning site of an expression construct as
claimed in any one of claims 5 to 8; b)
transforming a suitable host cell with the
expression construct thus obtained; and c)
culturing the host cell under conditions
allowing expression of a fusion polypeptide
consisting of the amino acid sequence of the
purification tag with the amino acid sequence
of the polypeptide to be expressed covalently
linked thereto; and d) isolating the fusion
polypeptide from the host cell or the culture
medium by means of binding the fusion
polypeptide present therein through the amino
acid sequence of the purification tag.

10. The method according to claim 9, wherein the
sortase gene product is a Staphylococcus aureus
srtA gene product.

11. The method according to claim 9 or claim 10
wherein the sortase gene product is encoded by
the nucleotide sequence shown in Figure 4 or a
variant or fragment thereof.

12. The method according to any one of claims 9 to
11 wherein the sortase gene product comprises
amino acids 26 to 171 of the SrtA sequence
shown in Figure 4 or a variant or fragment
thereof.

13. A fusion polypeptide obtained by the method of
any one of claims 9 to 12.

14. A purification tag comprising a sortase gene
product.

15. The purification tag according to claim 14
wherein the gene product is a Staphylococcus
aureus srtA gene product.

16. The purification tag according to claim 14 or
claim 15 wherein the sortase gene product is
encoded by the nucleotide sequence shown in
Figure 4 or a variant or fragment thereof.

17. The purification tag according to any one of
claims 14 to 16 wherein the sortase gene
product comprises amino acids 26 to 171 of the
SrtA sequence shown in Figure 4 or a variant or
fragment thereof.